Active learning (machine learning)

Active learning is a special case of machine learning in which a learning algorithm can interactively query a human user (or some other information source) to label new data points with the desired outputs. The human user must have knowledge or expertise in the problem domain, including the ability to consult authoritative sources when necessary.^[1]^[2]^[3] In the statistics literature, it is sometimes also called optimal experimental design.^[4] The information source is also called a teacher or an oracle.

There are situations in which unlabeled data is abundant but manual labeling is expensive. In such a scenario, a learning algorithm can actively query the user or teacher for labels. This type of iterative supervised learning is called active learning. Since the learner chooses the examples, the number of examples needed to learn a concept can often be much lower than the number required in normal supervised learning. With this approach, there is a risk that the algorithm is overwhelmed by uninformative examples. Recent developments are dedicated to multi-label active learning,^[5] hybrid active learning,^[6] and active learning in a single-pass (on-line) context,^[7] combining concepts from the field of machine learning (e.g. conflict and ignorance) with adaptive, incremental learning policies from the field of online machine learning. Active learning can also speed up the development of a machine learning model in cases where comparable updates would otherwise require a quantum computer or a supercomputer.^[8]
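The claim that a learner who chooses its own examples may need far fewer labels can be illustrated with a deliberately simple sketch. It assumes a noise-free one-dimensional threshold concept, for which querying the midpoint of the still-ambiguous region amounts to binary search. The names `make_oracle`, `active_learn_threshold` and `passive_learn_threshold` are hypothetical and chosen only for this illustration; practical active learners use richer models and query strategies.

```python
import random


def make_oracle(threshold):
    """Hypothetical oracle: labels x as 1 if x >= threshold, else 0.

    Also counts how many labels were requested (the labeling cost).
    """
    state = {"queries": 0}

    def oracle(x):
        state["queries"] += 1
        return 1 if x >= threshold else 0

    return oracle, state


def active_learn_threshold(pool, oracle):
    """Active learner: always query the midpoint of the ambiguous region.

    Returns the index (in the sorted pool) of the first positive point,
    or len(pool) if every point is negative.
    """
    points = sorted(pool)
    lo, hi = 0, len(points)  # the first positive index lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if oracle(points[mid]) == 1:
            hi = mid          # boundary is at mid or to its left
        else:
            lo = mid + 1      # boundary is strictly to the right of mid
    return lo


def passive_learn_threshold(pool, oracle):
    """Passive baseline: request a label for every point in the pool."""
    points = sorted(pool)
    labels = [oracle(x) for x in points]
    return labels.index(1) if 1 in labels else len(points)


# Assumed setup: 1000 unlabeled points and a hidden threshold of 0.37.
rng = random.Random(0)
pool = [rng.random() for _ in range(1000)]

active_oracle, active_state = make_oracle(0.37)
passive_oracle, passive_state = make_oracle(0.37)

active_idx = active_learn_threshold(pool, active_oracle)
passive_idx = passive_learn_threshold(pool, passive_oracle)

print("same decision boundary:", active_idx == passive_idx)
print("labels requested (active): ", active_state["queries"])
print("labels requested (passive):", passive_state["queries"])
```

Under these assumptions both learners recover the same decision boundary, but the active learner requests at most 10 labels for the 1000-point pool, while the passive baseline requests all 1000. With noisy labels or higher-dimensional data the saving is usually smaller and not guaranteed.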

Large-scale active learning projects may benefit from crowdsourcing frameworks such as Amazon Mechanical Turk, which bring many humans into the active learning loop.

^ Settles, Burr (2010). "Active Learning Literature Survey" (PDF). Computer Sciences Technical Report 1648. University of Wisconsin–Madison. Retrieved 2014-11-18.
^ Rubens, Neil; Elahi, Mehdi; Sugiyama, Masashi; Kaplan, Dain (2016). "Active Learning in Recommender Systems". In Ricci, Francesco; Rokach, Lior; Shapira, Bracha (eds.). Recommender Systems Handbook (PDF) (2 ed.). Springer US. doi:10.1007/978-1-4899-7637-6. hdl:11311/1006123. ISBN 978-1-4899-7637-6. S2CID 11569603.
^ Das, Shubhomoy; Wong, Weng-Keen; Dietterich, Thomas; Fern, Alan; Emmott, Andrew (2016). "Incorporating Expert Feedback into Active Anomaly Discovery". In Bonchi, Francesco; Domingo-Ferrer, Josep; Baeza-Yates, Ricardo; Zhou, Zhi-Hua; Wu, Xindong (eds.). IEEE 16th International Conference on Data Mining. IEEE. pp. 853–858. doi:10.1109/ICDM.2016.0102. ISBN 978-1-5090-5473-2. S2CID 15285595.
^ Olsson, Fredrik (April 2009). "A literature survey of active machine learning in the context of natural language processing". SICS Technical Report T2009:06.
^ Cite error: the named reference "multi" was invoked but never defined.
^ Cite error: the named reference "hybrid" was invoked but never defined.
^ Cite error: the named reference "single-pass" was invoked but never defined.
^ Novikov, Ivan (2021). "The MLIP package: moment tensor potentials with MPI and active learning". Machine Learning: Science and Technology. 2 (2): 3, 4. arXiv:2007.08555. doi:10.1088/2632-2153/abc9fe.
